{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ac3f714d-c479-4037-ab16-32835443167d",
   "metadata": {
    "id": "XIyP_0r6zuVc"
   },
   "source": [
    "<!-- Banner Image -->\n",
    "<img src=\"https://uohmivykqgnnbiouffke.supabase.co/storage/v1/object/public/landingpage/brevdevnotebooks.png\" width=\"100%\">\n",
    "\n",
    "<!-- Links -->\n",
    "<center>\n",
    "  <a href=\"https://console.brev.dev\" style=\"color: #06b6d4;\">Console</a> •\n",
    "  <a href=\"https://brev.dev\" style=\"color: #06b6d4;\">Docs</a> •\n",
    "  <a href=\"/\" style=\"color: #06b6d4;\">Templates</a> •\n",
    "  <a href=\"https://discord.gg/NVDyv7TUgJ\" style=\"color: #06b6d4;\">Discord</a>\n",
    "</center>\n",
    "\n",
    "# Create your own Chatbot 🤙\n",
    "\n",
    "Welcome!\n",
    "\n",
    "**ChatGPT Plus has paused signups, so we wanted to share how you can easily run your own chat GPT.**\n",
    "\n",
    "In this notebook, we will create a chatbot using [**Zephyr 7B**](https://huggingface.co/papers/2310.16944), a popular chat model which was fine-tuned on [Mistral 7B](https://huggingface.co/mistralai/Mistral-7B-v0.1). Zephyr 7B Beta, made by the [Hugging Face H4 team](https://huggingface.co/HuggingFaceH4), **outperforms GPT 3.5, LLama 2 70B (& smaller), and all other 7B models** [[Alpaca Leaderboard](https://tatsu-lab.github.io/alpaca_eval/)]. Plus, it's **25x smaller than GPT 3.5**!\n",
    "\n",
    "You'll be able to swap out the model if you'd like with another chat/instruct model.\n",
    "\n",
    "### Help us make this tutorial better! Please provide feedback on the [Discord channel](https://discord.gg/y9428NwTh3) or on [X](https://x.com/harperscarroll).\n",
    "\n",
    "A note about running Jupyter Notebooks: Press Shift + Enter to run a cell. A * in the left-hand cell box means the cell is running. A number means it has completed. If your Notebook is acting weird, you can interrupt a too-long process by interrupting the kernel (Kernel tab -> Interrupt Kernel) or even restarting the kernel (Kernel tab -> Restart Kernel). Note restarting the kernel will require you to run everything from the beginning.\n",
    "\n",
    "If you ever have trouble importing something from Hugging Face, you may need to run `huggingface-cli login` in a shell. To open a shell in Jupyter Lab, click on 'Launcher' (or the '+' if it's not there) next to the notebook tab at the top of the screen. Under \"Other\", click \"Terminal\" and then run the command.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41bda99e-494a-4f6d-a698-00d0250263d8",
   "metadata": {
    "id": "hWI-uRLEyRgb"
   },
   "source": [
    "## Let's begin!\n",
    "\n",
    "I used a GPU from [brev.dev](https://brev.dev). I used an A10G, with 24GB GPU Memory, 32 GB RAM, 256 GB storage. This machine is about $1/hr. \n",
    "\n",
    "Click the badge below to get your preconfigured instance:\n",
    "\n",
    "[![ Click here to deploy.](https://uohmivykqgnnbiouffke.supabase.co/storage/v1/object/public/landingpage/brevdeploynavy.svg)](https://console.brev.dev/environment/new?instance=A10G:g5.2xlarge&diskStorage=256&name=chatbot&file=https://github.com/brevdev/notebooks/raw/main/llama2-finetune.ipynb&python=3.10&cuda=12.0.1)\n",
    "\n",
    "Once you've checked out your machine and landed in your instance page, select the specs you'd like  (I used **Python 3.10 and CUDA 12.0.1**; these should be preconfigured for you if you use the badge above) and click the \"Build\" button to build your verb container. Give this a few minutes.\n",
    "\n",
    "A few minutes after your model has started Running, click the 'Notebook' button on the top right of your screen once it illuminates (you may need to refresh the screen). You will be taken to a Jupyter Lab environment, where you can upload this Notebook.\n",
    "\n",
    "Note: You can connect your cloud credits (AWS or GCP) by clicking \"Org: \" on the top right, and in the panel that slides over, click \"Connect AWS\" or \"Connect GCP\" under \"Connect your cloud\" and follow the instructions linked to attach your credentials."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "930c942b-603b-4ded-9905-a8da8276953d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.0.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.3.1\u001b[0m\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
      "\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.0.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.3.1\u001b[0m\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "# You only need to run this once per machine, even if you re-start the kernel or machine\n",
    "# Remove the -q to print install output\n",
    "!pip install -q -U git+https://github.com/huggingface/transformers.git\n",
    "!pip install -q -U accelerate jupyter ipywidgets"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25fa6b03",
   "metadata": {},
   "source": [
    "Here we set the model. Feel free to replace it with any Hugging Face chat/instruct model (just check to use appropriate \"roles\" as defined below)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "5cdcf668",
   "metadata": {},
   "outputs": [],
   "source": [
    "models_dict = {\n",
    "    \"Zephyr 7B Alpha\": \"HuggingFaceH4/zephyr-7b-alpha\", \n",
    "    \"Zephyr 7B Beta\": \"HuggingFaceH4/zephyr-7b-beta\", \n",
    "    # You can add more chat/instruct models here \n",
    "}\n",
    "\n",
    "model_id = models_dict[\"Zephyr 7B Beta\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "5b1d1e2e-549b-4fee-be9a-7819096f0c19",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Loading checkpoint shards: 100%|██████████| 8/8 [01:32<00:00, 11.60s/it]\n",
      "(…)beta/resolve/main/generation_config.json: 100%|██████████| 111/111 [00:00<00:00, 694kB/s]\n",
      "(…)-beta/resolve/main/tokenizer_config.json: 100%|██████████| 1.43k/1.43k [00:00<00:00, 9.26MB/s]\n",
      "tokenizer.model: 100%|██████████| 493k/493k [00:00<00:00, 330MB/s]\n",
      "(…)phyr-7b-beta/resolve/main/tokenizer.json: 100%|██████████| 1.80M/1.80M [00:00<00:00, 26.8MB/s]\n",
      "(…)r-7b-beta/resolve/main/added_tokens.json: 100%|██████████| 42.0/42.0 [00:00<00:00, 254kB/s]\n",
      "(…)eta/resolve/main/special_tokens_map.json: 100%|██████████| 168/168 [00:00<00:00, 938kB/s]\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from transformers import pipeline\n",
    "\n",
    "# Running this may take a minute\n",
    "pipe = pipeline(\"text-generation\", model=model_id, torch_dtype=torch.bfloat16, device_map=\"auto\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ae3f4db2-f384-4978-8a1e-663077c16c8e",
   "metadata": {},
   "source": [
    "Now define the role you'd like the chatbot to assume."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "03820334-f703-4ff8-89ee-b427cede78fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "job_description = \"You are a machine learning expert. Please answer my questions.\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4dead056-02a8-4500-b931-040301d88dc6",
   "metadata": {},
   "source": [
    "Below we run the model! Use \"stop\" or \"exit\" to exit question answering generation. You can ignore the `UserWarning: You have modified the pretrained model configuration...` warning."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "43b103d4-3323-4bd6-9c6a-47dc19330d17",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Question:  How much does it cost to train an LLM?\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "The cost of training a large language model (LLM) using machine learning techniques can vary significantly depending on several factors. Here are some factors that can influence the cost:\n",
      "\n",
      "1. Hardware: Training an LLM requires a significant amount of computing power. The cost of training will depend on the type of hardware used, the size of the model, and the length of the training run. Using cloud computing services like Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure can increase the cost, as these services charge based on usage.\n",
      "\n",
      "2. Dataset: The size and quality of the dataset used to train the LLM can significantly affect the cost. Large datasets like the Common Crawl or the WebText dataset can require petabytes of storage and significant computing resources, which can drive up the cost.\n",
      "\n",
      "3. Model Architecture: The specific architecture of the LLM, such as the number of layers, the type of activation functions, and the size of the model's parameters, can affect the cost of training.\n",
      "\n",
      "4. Optimization Algorithm: The choice of optimization algorithm can also affect the cost of training. Some optimization algorithms, like Adam, can converge faster, reducing the number of training iterations required and lowering the cost.\n",
      "\n",
      "5. Preprocessing and Data Cleaning: Preprocessing and data cleaning steps can also contribute to the cost. These steps involve removing noise, duplicates, and irrelevant data, which can be time-consuming and resource-intensive.\n",
      "\n",
      "6. Model Evaluation: After training, the LLM needs to be evaluated on a separate set of data to determine its performance. This evaluation process can add to the cost of training.\n",
      "\n",
      "In general, training a state-of-the-art LLM like GPT-3 or BERT can cost several hundred thousand dollars, if not millions of dollars, depending on the specific factors listed above. However, the cost of training is decreasing as hardware becomes more powerful and cloud computing services become more affordable. Additionally, some organizations and research institutions are collaborating to share resources and reduce the cost of training LLMs.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Question:  Can you expand on point #4?\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Sure! Preprocessing and data cleaning are crucial steps in the machine learning pipeline, especially when working with large datasets. Preprocessing involves transforming the raw data into a format that the machine learning algorithm can understand, while data cleaning involves removing any unwanted or irrelevant data, handling missing values, and dealing with outliers.\n",
      "\n",
      "Preprocessing steps can include tasks like tokenization, which involves breaking down text into individual words or tokens, and stemming or lemmatization, which involves reducing words to their base or root forms. These steps can help the machine learning algorithm better understand the meaning of the text and make more accurate predictions.\n",
      "\n",
      "Data cleaning is important because real-world data is often messy and contains errors, inconsistencies, and missing values. Data cleaning involves removing any duplicates, handling missing values, and dealing with outliers, which can significantly improve the accuracy and performance of the machine learning algorithm.\n",
      "\n",
      "Preprocessing and data cleaning can be time-consuming and resource-intensive, especially when dealing with large datasets. For example, tokenizing a large dataset can require significant computing resources, and cleaning the data can involve complex algorithms to handle missing values and outliers.\n",
      "\n",
      "In summary, preprocessing and data cleaning are critical steps in the machine learning pipeline, but they can also contribute to the cost of training an LLM, especially when dealing with large datasets. However, these steps are necessary to ensure the accuracy and performance of the machine learning algorithm.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Question:  Can you expand on point #6?\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Certainly! After training the LLM, it's essential to evaluate its performance on a separate, unseen dataset to ensure that it can generalize well to new, previously unseen data. This process is called model evaluation, and it's an essential step in the machine learning pipeline.\n",
      "\n",
      "Model evaluation involves measuring the LLM's performance on a separate set of data, which wasn't used during the training process. This process helps to ensure that the LLM can generalize well to new, previously unseen data and that it doesn't overfit to the training data.\n",
      "\n",
      "There are several evaluation metrics used to measure the LLM's performance, such as accuracy, precision, recall, and F1 score. These metrics help to determine how well the LLM is able to classify or predict the correct output for a given input.\n",
      "\n",
      "Model evaluation can be a time-consuming process, especially when dealing with large datasets. The evaluation process involves splitting the dataset into a training set, a validation set, and a test set. The model is trained on the training set, and its performance is evaluated on the validation set to determine whether further training is necessary. Once the model has converged, it's evaluated on the test set to determine its final performance.\n",
      "\n",
      "In summary, model evaluation is a crucial step in the machine learning pipeline, and it can contribute to the cost of training an LLM, especially when dealing with large datasets. However, it's necessary to ensure that the LLM can generalize well to new, previously unseen data and that it's not overfitting to the training data.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Question:  stop\n"
     ]
    }
   ],
   "source": [
    "messages = [\n",
    "    {\n",
    "        \"role\": \"system\",\n",
    "        \"content\": job_description,\n",
    "    },\n",
    "]\n",
    "    \n",
    "exit_terms = [\"stop\", \"exit\"]\n",
    "\n",
    "while True:\n",
    "    question = input(\"\\nQuestion: \")\n",
    "    if question.lower() in exit_terms: \n",
    "        break\n",
    "    messages.append({\"role\": \"user\", \"content\": question})\n",
    "    prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\n",
    "    outputs = pipe(prompt, max_new_tokens=512, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)\n",
    "    output = outputs[0][\"generated_text\"]\n",
    "    messages.append({\"role\": \"assistant\", \"content\": output})\n",
    "    response_start = output.rfind('<|assistant|>')\n",
    "    print(output[response_start + len('<|assistant|>'):])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dbd51c7a-80ad-416b-9268-e2fc1c367ad1",
   "metadata": {},
   "source": [
    "### Sweet... it worked! Epic!!!\n",
    "\n",
    "I hope you enjoyed this tutorial on building your own Chatbot. Please join the community on [Discord](https://discord.gg/y9428NwTh3)! \n",
    "\n",
    "If you have any questions, please reach out to me on [X](https://x.com/harperscarroll) or in the [Discord](https://discord.gg/y9428NwTh3).\n",
    "\n",
    "🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙 🤙"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
